Performance and scalability of indexed subgraph query processing methods

Abstract

Graph data management systems have become very popular, as graphs are the natural data model for many applications. One of the main problems addressed by these systems is subgraph query processing; i.e., given a query graph, return all graphs that contain the query. The naive method for processing such queries is to perform a subgraph isomorphism test against each graph in the dataset. This obviously does not scale, as subgraph isomorphism is NP-complete. Thus, many indexing methods have been proposed to reduce the number of candidate graphs that have to undergo the subgraph isomorphism test. In this paper, we identify a set of key factors (parameters) that influence the performance of related methods: namely, the number of nodes per graph, the graph density, the number of distinct labels, the number of graphs in the dataset, and the query graph size. We then conduct comprehensive and systematic experiments that analyze the sensitivity of the various methods to the values of the key parameters. Our aims are twofold: first, to derive conclusions about the algorithms’ relative performance, and, second, to stress-test all algorithms, deriving insights as to their scalability, and to highlight how both performance and scalability depend on the above factors. We choose six well-established indexing methods, namely Grapes, CT-Index, GraphGrepSX, gIndex, Tree+∆, and gCode, as representative approaches of the overall design space, including the most recent and best performing methods. We report on their index construction time and index size, and on query processing performance in terms of time and false positive ratio. We employ both real and synthetic datasets. Specifically, four real datasets of different characteristics are used: AIDS, PDBS, PCM, and PPI. In addition, we generate a large number of synthetic graph datasets, empowering us to systematically study the algorithms’ performance and scalability versus the aforementioned key parameters.
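The filter-then-verify scheme the abstract describes can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Graph` class, the label-count filter (a deliberately crude stand-in, not one of the six evaluated methods), the brute-force backtracking verifier, and the false positive ratio computed as (candidates − answers) / candidates for a single query, which may differ from the exact definition used in the paper.

```python
from collections import Counter


class Graph:
    """Undirected node-labelled graph (hypothetical helper for this sketch)."""

    def __init__(self, labels, edges):
        self.labels = dict(labels)                  # node id -> label
        self.adj = {v: set() for v in self.labels}  # node id -> neighbour ids
        for u, v in edges:
            self.adj[u].add(v)
            self.adj[v].add(u)


def label_filter(query, graph):
    """Cheap necessary condition: the graph must hold at least as many
    nodes of each label as the query. Passing it does not prove containment."""
    need = Counter(query.labels.values())
    have = Counter(graph.labels.values())
    return all(have[label] >= count for label, count in need.items())


def contains(graph, query):
    """Exact (non-induced) subgraph isomorphism test by backtracking.
    Worst-case exponential, which is why filtering matters."""
    q_nodes = list(query.labels)

    def extend(mapping):
        if len(mapping) == len(q_nodes):
            return True
        q = q_nodes[len(mapping)]
        for g in graph.labels:
            if g in mapping.values() or graph.labels[g] != query.labels[q]:
                continue
            # Every already-mapped neighbour of q must be adjacent to g.
            if all(mapping[n] in graph.adj[g] for n in query.adj[q] if n in mapping):
                mapping[q] = g
                if extend(mapping):
                    return True
                del mapping[q]
        return False

    return extend({})


def answer_query(dataset, query):
    """Filter the dataset, verify the survivors, and report the
    false positive ratio for this query (assumed definition)."""
    candidates = [name for name, g in dataset.items() if label_filter(query, g)]
    answers = [name for name in candidates if contains(dataset[name], query)]
    fp_ratio = (len(candidates) - len(answers)) / len(candidates) if candidates else 0.0
    return candidates, answers, fp_ratio


# Toy dataset: a triangle, a path, and a single A-A edge.
dataset = {
    "g1": Graph({0: "A", 1: "B", 2: "C"}, [(0, 1), (1, 2), (0, 2)]),
    "g2": Graph({0: "A", 1: "B", 2: "C"}, [(0, 1), (1, 2)]),
    "g3": Graph({0: "A", 1: "A"}, [(0, 1)]),
}
query = Graph({0: "A", 1: "C"}, [(0, 1)])  # a single A-C edge
candidates, answers, fp_ratio = answer_query(dataset, query)
print(candidates, answers, fp_ratio)
```

In this toy run the filter prunes `g3` (it has no node labelled C) without any isomorphism test, while `g2` survives the filter but fails verification because its A and C nodes are not adjacent. A more selective index would shrink the candidate set further, at the cost of a larger index and a longer construction time.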
